In football (association football), players go from hero to zero, or vice versa, very quickly. Performance is not a static measure but a volatile one. Analysing performance as a time series, rather than as a stationary point in time, is critical to making better decisions. This paper introduces and explores the I-VAEP and O-VAEP models to evaluate actions and rate players' intention and execution. We then analyse these ratings over time and propose use cases that substantiate our choice to treat player rating as a continuous problem. As a result, we present who the best players are and how their performance evolves, define volatility metrics to measure a player's consistency, and build player development curves to aid decision-making.
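As a toy illustration of the kind of volatility indicator described above, one could take the rolling standard deviation of a player's per-match ratings. The sketch below assumes the ratings are already available as a time series; the window size and function names are illustrative, not the paper's:

```python
import numpy as np

def rolling_volatility(ratings, window=10):
    """Rolling standard deviation of per-match VAEP-style ratings.

    `ratings` is a 1-D sequence of per-match scores in date order;
    a higher output means a less consistent player over that window.
    """
    ratings = np.asarray(ratings, dtype=float)
    out = np.full(ratings.shape, np.nan)
    for i in range(window - 1, len(ratings)):
        out[i] = ratings[i - window + 1 : i + 1].std()
    return out

# A streaky player shows much higher volatility than a steady one.
streaky = rolling_volatility([0.9, 0.1, 0.8, 0.0, 1.0, 0.2] * 3, window=6)
steady = rolling_volatility([0.50, 0.45, 0.55, 0.48, 0.52, 0.47] * 3, window=6)
```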
Landing an unmanned aerial vehicle (UAV) on top of an unmanned surface vehicle (USV) in harsh open waters is a challenging problem, owing to forces that can damage the UAV due to a severe roll and/or pitch angle of the USV during touchdown. To tackle this, we propose a novel model predictive control (MPC) approach enabling a UAV to land autonomously on a USV in these harsh conditions. The MPC employs a novel objective function and an online decomposition of the oscillatory motion of the vessel to predict, attempt, and accomplish the landing during near-zero tilt of the landing platform. The nonlinear prediction of the motion of the vessel is performed using visual data from an onboard camera. Therefore, the system does not require any communication with the USV or a control station. The proposed method was analyzed in numerous robotics simulations in harsh and extreme conditions and further validated in various real-world scenarios.
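To make the idea of the online decomposition concrete, here is a minimal sketch in which the platform's recent tilt history is fit with a single sinusoid and the fit is extrapolated to find near-zero-tilt touchdown windows. The paper's decomposition of the vessel motion is more general, so this is a simplified stand-in, and all names and thresholds are illustrative:

```python
import numpy as np
from scipy.optimize import curve_fit

def sinusoid(t, amp, omega, phase, offset):
    return amp * np.sin(omega * t + phase) + offset

# Tilt history of the landing platform, e.g. estimated visually onboard.
t_hist = np.linspace(0.0, 10.0, 200)
tilt_hist = 0.2 * np.sin(1.3 * t_hist + 0.4) + 0.01 * np.random.randn(200)

# Fit the oscillation, then extrapolate to find near-zero-tilt windows
# in which an MPC could time the touchdown.
params, _ = curve_fit(sinusoid, t_hist, tilt_hist, p0=[0.1, 1.0, 0.0, 0.0])
t_future = np.linspace(10.0, 15.0, 500)
touchdown_windows = t_future[np.abs(sinusoid(t_future, *params)) < 0.02]
```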
Language modeling, a central task in natural language processing, involves estimating a probability distribution over strings. In most cases, the estimated distribution sums to 1 over all finite strings. However, in some pathological cases, probability mass can ``leak'' onto the set of infinite sequences. In order to characterize the notion of leakage more precisely, this paper offers a measure-theoretic treatment of language modeling. We prove that many popular language model families are in fact tight, meaning that they will not leak in this sense. We also generalize characterizations of tightness proposed in previous works.
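A standard worked example of the distinction, under the usual convention that a string terminates when the model emits an end-of-sequence symbol (the concrete probabilities here are illustrative, not from the paper): a model that emits EOS with a constant probability $\epsilon > 0$ at every step is tight, since the total mass on finite strings is $\sum_{n \ge 0} (1-\epsilon)^n \epsilon = 1$ and the mass on infinite sequences is $\lim_{n \to \infty} (1-\epsilon)^n = 0$. If instead the EOS probability at step $t$ decays as $\epsilon_t = 2^{-(t+2)}$, the probability of never terminating is $\prod_{t \ge 0} \bigl(1 - 2^{-(t+2)}\bigr) > 0$ (the product converges to a positive value because $\sum_t 2^{-(t+2)}$ is finite), and that residual mass has leaked onto infinite sequences.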
After just a few hundred training updates, a standard probabilistic model for language generation has likely not yet learnt many semantic or syntactic rules of natural language, which inherently makes it difficult to estimate the right probability distribution over next tokens. Yet around this point, these models have identified a simple, loss-minimising behaviour: to output the unigram distribution of the target training corpus. The use of such a crude heuristic raises the question: Rather than wasting precious compute resources and model capacity for learning this strategy at early training stages, can we initialise our models with this behaviour? Here, we show that we can effectively endow our model with a separate module that reflects unigram frequency statistics as prior knowledge. Standard neural language generation architectures offer a natural opportunity for implementing this idea: by initialising the bias term in a model's final linear layer with the log-unigram distribution. Experiments in neural machine translation demonstrate that this simple technique: (i) improves learning efficiency; (ii) achieves better overall performance; and (iii) appears to disentangle strong frequency effects, encouraging the model to specialise in non-frequency-related aspects of language.
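A minimal sketch of this initialisation, assuming token counts over the training corpus are available; the PyTorch layout and helper name are illustrative, not the paper's code:

```python
import torch
import torch.nn as nn

def init_bias_with_log_unigram(out_layer: nn.Linear, token_counts) -> None:
    """Initialise the final linear layer's bias with the log-unigram
    distribution, so the untrained model already outputs roughly the
    corpus unigram frequencies (softmax(W h + b) ~ softmax(b) while the
    logits contributed by W h are still near zero)."""
    counts = torch.as_tensor(token_counts, dtype=torch.float)
    probs = (counts + 1.0) / (counts.sum() + counts.numel())  # add-one smoothing
    with torch.no_grad():
        out_layer.bias.copy_(probs.log())

vocab_size, d_model = 32_000, 512
projection = nn.Linear(d_model, vocab_size)
init_bias_with_log_unigram(projection, torch.randint(1, 1_000, (vocab_size,)))
```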
In this work, we investigate the representation capacity of multilayer perceptron networks that use the sine as their activation function (sinusoidal neural networks). We show that the layer composition in such networks compacts information. To show this, we prove that the composition of sinusoidal layers expands as a sum of sines comprising a large number of new frequencies given by linear combinations of the weights of the network's first layer. We provide the expression of the corresponding amplitudes in terms of Bessel functions and give an upper bound for them that can be used to control the resulting approximation.
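For intuition in the simplest case, the frequency expansion follows from the Jacobi-Anger identity (a standard fact, stated here for a single neuron without bias): $\sin(a \sin\theta) = 2 \sum_{k \ge 0} J_{2k+1}(a) \sin\bigl((2k+1)\theta\bigr)$, so feeding one input frequency $\theta$ through two sinusoidal layers already produces the infinite family of harmonics $(2k+1)\theta$, with amplitudes given by the Bessel functions $J_{2k+1}(a)$, whose rapid decay in $k$ is what makes an upper bound of the kind mentioned above possible.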
In this paper, we seek to measure how much information a component in a neural network could extract from the representations fed into it. Our work stands in contrast to prior probing work, most of which investigates how much information a model's representations contain. This shift in perspective leads us to propose a new principle for probing, the architectural bottleneck principle: In order to estimate how much information a given component could extract, a probe should look exactly like the component. Relying on this principle, we estimate how much syntactic information is available to transformers through our attentional probe, a probe that exactly resembles a transformer's self-attention head. Experimentally, we find that, in three models (BERT, ALBERT, and RoBERTa), a sentence's syntax tree is mostly extractable by our probe, suggesting these models have access to syntactic information while composing their contextual representations. Whether this information is actually used by these models, however, remains an open question.
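A minimal sketch of what a probe shaped exactly like a self-attention head could look like, assuming frozen contextual representations as input; the dimensions and training note are illustrative, not the paper's exact setup:

```python
import torch
import torch.nn as nn

class AttentionalProbe(nn.Module):
    """A probe with the same form as one self-attention head: its
    pairwise attention weights are read as soft syntactic attachments."""

    def __init__(self, d_model: int, d_head: int = 64):
        super().__init__()
        self.query = nn.Linear(d_model, d_head, bias=False)
        self.key = nn.Linear(d_model, d_head, bias=False)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (seq_len, d_model) frozen representations from, e.g., BERT.
        q, k = self.query(h), self.key(h)
        scores = q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5
        # Row i is a distribution over candidate syntactic heads of token i.
        return scores.softmax(dim=-1)

# Training could maximise the weight each token assigns to its parent in
# the gold syntax tree, i.e., cross-entropy over the rows above.
```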
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
We train graph neural networks on halo catalogs from Gadget N-body simulations to perform field-level likelihood-free inference of cosmological parameters. The catalogs contain $\lesssim 5{,}000$ halos with masses $\gtrsim 10^{10}~h^{-1}M_\odot$ in a periodic volume of $(25~h^{-1}{\rm Mpc})^3$; every halo in the catalog is characterized by several properties such as position, mass, velocity, concentration, and maximum circular velocity. Our models, built to be invariant to permutations, translations, and rotations, do not impose a minimum scale on which to extract information, and are able to infer the values of $\Omega_{\rm m}$ and $\sigma_8$ with a mean relative error of $\sim 6\%$ when using positions plus velocities and positions plus masses, respectively. More importantly, we find that our models are very robust: they can infer the values of $\Omega_{\rm m}$ and $\sigma_8$ when tested on halo catalogs from thousands of N-body simulations run with five different N-body codes: Abacus, CUBEP$^3$M, Enzo, PKDGrav3, and Ramses. Surprisingly, the model trained to infer $\Omega_{\rm m}$ also works when tested on thousands of state-of-the-art CAMELS hydrodynamic simulations, run with four different codes and subgrid physics implementations. Using halo properties such as concentration and maximum circular velocity allows our models to extract more information, at the expense of model robustness. This may happen because the different N-body codes are not converged on the relevant scales corresponding to these properties.
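As a rough sketch of how the stated invariances can be built in, consider message passing that uses only scalars (pairwise distances and halo masses): distances make the output invariant to translations and rotations, and sum and mean aggregations make it invariant to permutations. The architecture below is a generic illustration, not the paper's model, and it ignores details such as the periodic boundary conditions:

```python
import torch
import torch.nn as nn

class InvariantHaloGNN(nn.Module):
    """Toy permutation-, translation-, and rotation-invariant network
    mapping a halo catalog to (Omega_m, sigma_8) estimates."""

    def __init__(self, hidden: int = 64):
        super().__init__()
        self.edge_mlp = nn.Sequential(nn.Linear(3, hidden), nn.ReLU(),
                                      nn.Linear(hidden, hidden))
        self.readout = nn.Linear(hidden, 2)

    def forward(self, pos: torch.Tensor, mass: torch.Tensor) -> torch.Tensor:
        # pos: (N, 3) halo positions; mass: (N, 1) halo masses (log-scaled).
        n = pos.shape[0]
        dist = torch.cdist(pos, pos).unsqueeze(-1)       # (N, N, 1) scalars
        m_i = mass.unsqueeze(1).expand(n, n, 1)
        m_j = mass.unsqueeze(0).expand(n, n, 1)
        messages = self.edge_mlp(torch.cat([dist, m_i, m_j], dim=-1))
        nodes = messages.sum(dim=1)      # permutation-invariant aggregation
        return self.readout(nodes.mean(dim=0))  # global pooling -> parameters
```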
The Bar-Hillel construction is a classic result in formal language theory. It shows, by construction, that the intersection of a context-free language and a regular language is itself context-free. However, neither its original formulation (Bar-Hillel et al., 1961) nor its weighted extension (Nederhof and Satta, 2003) can handle automata with $\epsilon$-arcs. In this short note, we generalize the Bar-Hillel construction to correctly compute the intersection even when the automaton contains $\epsilon$-arcs. We further prove that our generalized construction leads to a grammar that encodes the structure of both the input automaton and the grammar, while retaining the asymptotic size of the original construction.
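For reference, the core of the classic (unweighted) construction that the note generalizes is standard: for a grammar production $A \to B\,C$ and automaton states $p, q, r$, the intersection grammar contains the productions $\langle p, A, q \rangle \to \langle p, B, r \rangle\,\langle r, C, q \rangle$ for all choices of $p, r, q$, together with terminal productions $\langle p, a, q \rangle \to a$ for every arc $p \xrightarrow{a} q$ of the automaton. The difficulty addressed above is visible here: an arc $p \xrightarrow{\epsilon} q$ consumes no terminal symbol and therefore has no counterpart among these rules.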
Computational approaches are beginning to be used to design dynamic visual identities fuelled by data and generative processes. In this work, we explore these computational approaches to generate visual identities that create bespoke letterings and images. We achieve this with a generative design system that automatically assembles black-and-white visual modules. The system generates designs using two main methods: (i) assisted generation and (ii) automatic generation. The assisted generation method produces outputs in which the placement of the modules is determined by previously defined configuration files. The automatic generation method, in turn, produces outputs in which the modules are assembled to depict an input image. This system speeds up the process of designing and deploying one visual identity and generates visual coherence among its outputs. In this paper, we comprehensively describe the system and its results.
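A toy sketch of the automatic generation method, under the assumption that modules can be ranked from light to dark and placed on a grid according to the input image's local brightness; the module set, grid logic, and names below are illustrative, not the system's actual implementation:

```python
import numpy as np
from PIL import Image

# Hypothetical module set, ordered from lightest to darkest ink coverage.
MODULES = [" ", ".", "o", "O", "#"]

def assemble(image_path: str, cols: int = 60) -> str:
    """Depict an input image by assembling modules on a grid:
    darker image regions receive denser modules."""
    img = Image.open(image_path).convert("L")      # grayscale input
    rows = max(1, round(cols * img.height / img.width / 2))
    gray = np.asarray(img.resize((cols, rows)), dtype=float) / 255.0
    idx = np.rint((1.0 - gray) * (len(MODULES) - 1)).astype(int)
    return "\n".join("".join(MODULES[i] for i in row) for row in idx)
```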